Emotion-Guided Image to Music Generation

Kundu, Souraja, Singh, Saket, Iwahori, Yuji

arXiv.org Artificial Intelligence

Generating music from images can enhance various applications, including background music for photo slideshows, social media experiences, and video creation. This paper presents an emotion-guided image-to-music generation framework that leverages the Valence-Arousal (VA) emotional space to produce music that aligns with the emotional tone of a given image. Unlike previous models that rely on contrastive learning for emotional consistency, the proposed approach directly integrates a VA loss function to enable accurate emotional alignment. The model employs a CNN-Transformer architecture, featuring pre-trained CNN image feature extractors and three Transformer encoders to capture complex, high-level emotional features from MIDI music. Three Transformer decoders refine these features to generate musically and emotionally consistent MIDI sequences. Experimental results on a newly curated emotionally paired image-MIDI dataset demonstrate the proposed model's superior performance across metrics such as Polyphony Rate, Pitch Entropy, Groove Consistency, and loss convergence.
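Of the metrics named above, Pitch Entropy is commonly defined as the Shannon entropy of a piece's pitch histogram. The abstract does not give the paper's exact formula, so the following is only a minimal sketch under that assumption; the function name `pitch_entropy` and the use of MIDI note numbers as input are illustrative choices, not taken from the paper.

```python
import math
from collections import Counter


def pitch_entropy(pitches):
    """Shannon entropy, in bits, of the pitch histogram of a note sequence.

    `pitches` is assumed to be a list of MIDI note numbers (0-127).
    This is a hypothetical helper, not the paper's implementation.
    """
    if not pitches:
        return 0.0
    counts = Counter(pitches)
    total = len(pitches)
    # Sum -p * log2(p) over the observed pitches only.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    # A single repeated pitch versus four equally frequent pitches.
    print(pitch_entropy([60, 60, 60]))
    print(pitch_entropy([60, 62, 64, 65]))
```

Under this definition, a sequence that repeats one pitch has zero entropy, and entropy grows as notes spread more evenly across more pitches.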


The Role of Language Models in Modern Healthcare: A Comprehensive Review

Khalid, Amna, Khalid, Ayma, Khalid, Umar

arXiv.org Artificial Intelligence

The application of large language models (LLMs) in healthcare has gained significant attention due to their ability to process complex medical data and provide insights for clinical decision-making. These models have demonstrated substantial capabilities in understanding and generating natural language, which is crucial for medical documentation, diagnostics, and patient interaction. This review examines the trajectory of language models from their early stages to the current state-of-the-art LLMs, highlighting their strengths in healthcare applications and discussing challenges such as data privacy, bias, and ethical considerations. The potential of LLMs to enhance healthcare delivery is explored, alongside the necessary steps to ensure their ethical and effective integration into medical practice.


Detecting Cross-Modal Inconsistency to Defend Against Neural Fake News

Tan, Reuben, Saenko, Kate, Plummer, Bryan A.

arXiv.org Artificial Intelligence

Large-scale dissemination of disinformation online intended to mislead or deceive the general population is a major societal problem. Rapid progress in image, video, and natural language generative models has only exacerbated this situation and intensified our need for an effective defense mechanism. While existing approaches have been proposed to defend against neural fake news, they are generally constrained to the very limited setting where articles only have text and metadata such as the title and authors. In this paper, we introduce the more realistic and challenging task of defending against machine-generated news that also includes images and captions. To identify the possible weaknesses that adversaries can exploit, we create a NeuralNews dataset composed of four different types of generated articles and conduct a series of human user study experiments based on this dataset. In addition to the valuable insights gleaned from our user study experiments, we provide a relatively effective approach based on detecting visual-semantic inconsistencies, which will serve as an effective first line of defense and a useful reference for future work in defending against machine-generated disinformation.


Police Used Bomb Disposal Robot To Kill A Dallas Shooting Suspect

#artificialintelligence

In the wake of post-protest shootings that left five police officers dead and seven other officers and two civilians wounded, police traded gunfire last night with a suspect inside a downtown Dallas parking garage. Eventually, law enforcement sent a "bomb robot" (most likely shorthand for a remotely controlled bomb disposal robot) armed with an explosive to the suspect's location, then detonated the explosive, killing the suspect. "We saw no other option but to use our bomb robot and place a device on its extension for it to detonate where the suspect was…other options would have exposed our officers to great danger," said Dallas Police Chief David O. Brown. "The suspect is deceased as a result of detonating the bomb." Repurposing a robot that was created to prevent death by explosion clearly contrasts with the way these machines are normally used. Bomb disposal robots are routinely used to minimize the potential for harm to officers and civilians when disarming or clearing potential explosives from an area.